ABSTRACT

Despite the identification of horseshoe bats as the reservoir of SARS-related-coronaviruses
(SARSr-CoVs), the origin of SARS-CoV ORF8, which contains the 29-nt signature deletion
among human strains, remains obscure. Although two SARSr-Rs-BatCoVs, RsSHC014 and
Rs3367, previously detected from Chinese horseshoe bats (Rhinolophus sinicus) in Yunnan,
possessed 95% genome identities to human/civet SARSr-CoVs, their ORF8 exhibited only
32.2-33% aa identities to that of human/civet SARSr-CoVs. To elucidate the origin of
SARS-CoV ORF8, we sampled 348 bats of various species in Yunnan, among which diverse
alphacoronaviruses and betacoronaviruses, including potentially novel CoVs, were
identified, with some showing potential interspecies transmission. The genomes of two
betacoronaviruses, SARSr-Rf-BatCoV YNLF_31C and YNLF_34C, from greater horseshoe bats
(Rhinolophus ferrumequinum), possessed 93% nt identities to human/civet SARSr-CoV
genomes. Although they displayed lower similarities to civet SARSr-CoVs than
SARSr-Rs-BatCoV RsSHC014 and Rs3367 in S protein, their ORF8 demonstrated exceptionally
high (80.4-81.3%) aa identities to that of human/civet SARSr-CoVs, compared to
SARSr-BatCoVs from other horseshoe bats (23.2-37.3%). Potential recombination events
were identified around ORF8 between SARSr-Rf-BatCoVs and SARSr-Rs-BatCoVs, leading to
the generation of civet SARSr-CoVs. The expression of ORF8 subgenomic mRNA suggested
that this protein may be functional in SARSr-Rf-BatCoVs. The high Ka/Ks ratio among
human SARS-CoVs compared to SARSr-BatCoVs supports the view that ORF8 is under strong
positive selection during animal-to-human transmission. Molecular clock analysis using
ORF1ab showed that SARSr-Rf-BatCoV YNLF_31C and YNLF_34C diverged from civet/human
SARSr-CoVs in approximately 1990. SARS-CoV ORF8 originated from SARSr-CoVs of greater
horseshoe bats through recombination, which may be important for animal-to-human
transmission.

IMPORTANCE

Although horseshoe bats are the primary reservoir of SARS-related-coronaviruses 
(SARSr-CoVs), it is still unclear how these bat viruses have evolved to cross the species 
barrier to infect civet/human. Most human SARS-CoV epidemic strains contained a 
signature 29-nt deletion in ORF8 compared to civet SARSr-CoVs, suggesting that ORF8 
may be important for interspecies transmission. However, the origin of SARS-CoV ORF8 
remains obscure. In particular, SARSr-Rs-BatCoVs from Chinese horseshoe bats 
exhibited <40% aa identities to human/civet SARS-CoV in ORF8. We detected diverse 
alphacoronaviruses and betacoronaviruses among various bat species in Yunnan, 
including two SARSr-Rf-BatCoVs from greater horseshoe bats that possessed an ORF8 
with exceptionally high aa identities to that of human/civet SARSr-CoVs. We 
demonstrated recombination events around ORF8 between SARSr-Rf-BatCoVs and 
SARSr-Rs-BatCoVs, leading to the generation of civet SARSr-CoVs. Our findings offer 
insight into the evolutionary origin of SARS-CoV ORF8, which was likely acquired from
SARSr-CoVs of greater horseshoe bats through recombination.




Journal of Virology 




INTRODUCTION 

Coronaviruses (CoVs) are known to cause respiratory, enteric, hepatic and neurological
diseases of varying severity in a variety of animals. They are currently classified into four
genera, Alphacoronavirus, Betacoronavirus, Gammacoronavirus and Deltacoronavirus,
replacing the traditional three groups, group 1 to 3 (1-4). The genus Betacoronavirus is
further classified into lineages A to D (3, 5, 6). Among CoVs that infect humans, human
CoV 229E (HCoV 229E) and human CoV NL63 (HCoV NL63) belong to
Alphacoronavirus; human CoV OC43 (HCoV OC43) and human CoV HKU1 (HCoV
HKU1) belong to Betacoronavirus lineage A; Severe Acute Respiratory Syndrome-related
CoV (SARSr-CoV) belongs to Betacoronavirus lineage B; and the recently
emerged Middle East Respiratory Syndrome CoV (MERS-CoV) belongs to
Betacoronavirus lineage C (7-16). The high recombination rate, coupled with the
infidelity of the RNA-dependent RNA polymerase (RdRp), may have helped CoVs
adapt to new hosts and ecological niches, causing epidemics in animals and humans (5,
17-24).

The SARS epidemic and the identification of SARSr-CoVs from palm civets and
horseshoe bats in China have boosted interest in the discovery of novel CoVs in both
humans and animals, especially bats (25-29). With the exception of lineage A
betacoronaviruses, bats are now known to be an important reservoir of diverse
alphacoronaviruses and lineage B, C and D betacoronaviruses (30-38), with bat CoVs
being the gene source for other mammalian CoVs (4). In particular, the findings of bat
CoVs related to SARS-CoV and MERS-CoV suggested that bats may be the animal
origin of both the SARS and MERS epidemics, while other animals have served as the
intermediate or amplifying hosts for animal-to-human transmission: palm civets in the
case of SARS and dromedary camels in MERS (25, 27, 28, 39-41). However, the
evolutionary paths from bat CoVs to CoVs capable of infecting intermediate hosts and
humans are not fully understood.

SARSr-CoVs have been detected in at least 11 different species of horseshoe bats
(genus Rhinolophus) from various countries in Asia, Africa and Europe (27, 28, 35, 37,
38, 42, 43). Related viruses have also been reported in bats of other genera, such as
Chaerophon and Hipposideros, from Africa and China (43-45). However, it is still
unclear how these bat CoVs have evolved to generate the ancestor of civet/human
SARSr-CoVs capable of crossing the species barrier. SARSr-CoVs, like other CoVs,
possess the characteristic gene order 5'-open reading frame 1ab (ORF1ab), spike (S),
ORF3, envelope (E), membrane (M), ORF6 to 8, nucleocapsid (N)-3'. It is known that
most human SARS-CoVs during the epidemic contained a signature 29-nt deletion in ORF8
compared to civet SARSr-CoVs (25), suggesting that this genomic region may be important
for interspecies transmission. However, the origin of SARS-CoV ORF8 remains obscure.
Genomes of SARS-related Rhinolophus sinicus BatCoVs (SARSr-Rs-BatCoVs), previously
designated SARSr-Rh-BatCoVs, from Chinese horseshoe bats (Rhinolophus sinicus) in Hong
Kong and the Guangdong Province shared only 87-92% nucleotide (nt) identities to
human/civet SARSr-CoV genomes (22, 27, 28). A subsequent study identified two
SARSr-Rs-BatCoVs, RsSHC014 and Rs3367, in the Yunnan Province, which were more closely
related to human/civet SARSr-CoVs (with 95% genome sequence identities) than any
other SARSr-BatCoVs (42). The S proteins of these two SARSr-Rs-BatCoVs from
Yunnan shared 90.1-92.3% amino acid (aa) identities to those of human/civet SARSr-CoVs,
compared to 79-80% aa identities between SARSr-Rs-BatCoVs from Hong Kong
and human/civet SARSr-CoVs (27, 42). Moreover, a highly similar virus, SARSr-Rs-BatCoV
WIV1, isolated in Vero E6 cells, was able to use angiotensin converting enzyme
II (ACE2) from humans, civets, and Chinese horseshoe bats as receptor for cell entry,
suggesting that intermediate hosts between bats and human/civets may not be necessary
for interspecies transmission (42). However, considerable genetic distance still exists
between the two SARSr-Rs-BatCoVs from Yunnan and human/civet SARSr-CoVs,
especially in the ORF8 region with only 32.2-33% aa identities.

To elucidate the evolutionary origin of SARS-CoV ORF8 and search for even
closer bat CoV ancestors of SARS-CoV, we conducted a three-month study (May to July
2013) on CoVs among various bats from different regions of the Yunnan Province.
Diverse CoVs were detected, including two SARS-related Rhinolophus ferrumequinum
BatCoVs (SARSr-Rf-BatCoVs) from greater horseshoe bats (Rhinolophus
ferrumequinum), which possessed an expressed ORF8 much more closely related to
human/civet SARSr-CoVs than CoVs detected from other bat species. Recombination
and molecular clock analyses were also performed to elucidate the evolutionary paths and
time of interspecies transmission of SARSr-CoVs.

MATERIALS AND METHODS

Ethics statement. The collection of bat samples was approved and performed by the 
Yunnan Institute of Endemic Diseases Control and Prevention, Dali, Yunnan, China. All 
bats were maintained and handled using standard procedures approved by the Medical 
Ethical Committee of Yunnan Institute of Endemic Diseases Control and Prevention, 
China. 

Sample collection. Bats were captured from various locations in five counties of 
four prefectures of the Yunnan Province, China from May to July 2013 (Fig. 1). Samples 
were collected using procedures described previously (27, 46). All samples were placed 
in viral transport medium (Earle’s balanced salt solution, 0.09% glucose, 0.03% sodium 
bicarbonate, 0.45% bovine serum albumin, 50 mg/ml amikacin, 50 mg/ml vancomycin, 
40 U/ml nystatin) and stored at -80°C before RNA extraction. 

RNA extraction. Viral RNA was extracted from alimentary samples using the
QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany). The RNA was eluted in 50 µl
of AVE buffer and was used as the template for RT-PCR.

RT-PCR for CoVs and DNA sequencing. CoV screening was performed by
amplifying a 440-bp fragment of the RdRp gene of CoVs using conserved primers
(5'-GGTTGGGACTATCCTAAGTGTGA-3' and 5'-ACCATCATCNGANARDATCATNA-3') (12).
Reverse transcription was performed using the SuperScript III kit (Invitrogen, Life
Technologies, Grand Island, NY, USA). The PCR mixture (25 µl) contained cDNA, PCR
buffer (10 mM Tris-HCl pH 8.3, 50 mM KCl, 3 mM MgCl2 and 0.01% gelatin), 200 µM
of each dNTP and 1.0 U Taq polymerase (Applied Biosystems, Life Technologies,
Grand Island, NY, USA). The mixtures were amplified in 40 cycles of 94°C for 1 min,
48°C for 1 min and 72°C for 1 min and a final extension at 72°C for 10 min in an
automated thermal cycler (Applied Biosystems). Standard precautions were taken to
avoid PCR contamination and no false positives were observed in negative controls.
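
As an illustrative aside that is not part of the original protocol, the degeneracy of a screening primer can be counted from its IUPAC ambiguity codes. The Python sketch below assumes the standard IUPAC nucleotide table; the function names `degeneracy` and `expand` are hypothetical helpers introduced only for this example.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]

# Reverse screening primer quoted in the text (three N, one R, one D).
REVERSE_PRIMER = "ACCATCATCNGANARDATCATNA"
print(degeneracy(REVERSE_PRIMER))  # 4 * 4 * 2 * 3 * 4 = 384
```

Under these assumptions the reverse primer represents a pool of 384 sequences, whereas the forward primer contains no ambiguity codes.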

The PCR products were gel-purified using the QIAquick gel extraction kit
(Qiagen). Both strands of the PCR products were sequenced twice with an ABI Prism
3700 DNA Analyzer (Applied Biosystems), using the two PCR primers. The sequences
of the PCR products were compared with known sequences of the RdRp genes of CoVs
in the GenBank database. A phylogenetic tree was constructed from the 266-bp fragments
of the RdRp gene by the maximum likelihood method, using the General Time Reversible
substitution model with gamma distribution and allowance for evolutionarily
invariable sites (GTR+G+I), in MEGA 5.0 (47).

Viral culture. The two samples positive for SARSr-Rf-BatCoVs were subjected to
virus isolation in Vero E6 (African green monkey kidney) and primary R. sinicus lung
cells as described previously (21).

Complete genome sequencing and analysis of SARSr-Rf-BatCoVs. Two
complete genomes of SARSr-Rf-BatCoVs were amplified and sequenced using RNA
extracted from the alimentary samples as templates. RNA was converted to cDNA by a
combined random-priming and oligo(dT) priming strategy. The cDNA was amplified by
degenerate primers as described previously (27). A total of 75 sets of primers, available
on request, were used for PCR. The 5' end of the viral genome was confirmed by rapid
amplification of cDNA ends using the 5'/3' SMARTer RACE cDNA Amplification Kit
(Clontech, USA). Sequences were assembled and manually edited to produce the final
sequences. The nt sequences of the genomes and the deduced aa sequences of the ORFs
were compared to those of other CoVs using the coronavirus database CoVDB (48).
Phylogenetic tree construction was performed using the maximum likelihood method with
MEGA 6.0.

Recombination analysis. To detect possible recombination between different
SARSr-BatCoVs and civet SARSr-CoVs, sliding window analysis was performed using a
nt alignment of the available genome sequences generated by ClustalX version 1.83 and
edited manually with BioEdit version 7.1.3. Similarity plot analysis and bootscan
analysis were performed using Simplot version 3.5.1 (49) (F84 model; window size, 1000
bp; step, 200 bp) with civet SARSr-CoV SZ3 as query.
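
The idea behind the sliding window analysis can be sketched in a few lines of Python. This is only a simplified illustration under stated assumptions: it reports raw percent identity per window rather than the F84-corrected values computed by Simplot, and the function name `window_identity` is hypothetical. The default window and step sizes follow the values given above.

```python
def window_identity(query, ref, window=1000, step=200):
    """Return (window midpoint, percent identity) pairs for two aligned sequences.

    Both inputs must come from the same multiple alignment (equal length).
    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(query) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(query) - window + 1, step):
        matches = 0
        compared = 0
        for a, b in zip(query[start:start + window], ref[start:start + window]):
            if a == "-" or b == "-":
                continue  # ignore gapped columns
            compared += 1
            if a == b:
                matches += 1
        identity = 100.0 * matches / compared if compared else float("nan")
        results.append((start + window // 2, identity))
    return results
```

Plotting these values for each candidate parent against the query gives a similarity plot; an abrupt switch in which parent is most similar marks a candidate recombination breakpoint.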

Estimation of synonymous and non-synonymous substitution rates. The 
number of synonymous substitutions per synonymous site, Ks, and the number of non- 
synonymous substitutions per non-synonymous site, Ka, for each coding region were 
calculated for all available SARSr-Rf-BatCoV, SARSr-Rs-BatCoV, civet SARSr-CoV 
and human SARSr-CoV genomes using the Nei-Gojobori method (Jukes-Cantor) in 
MEGA 5.0. 
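
For orientation, the Jukes-Cantor step of such a calculation can be sketched as follows. This is a minimal illustration, not the MEGA implementation: it assumes that the counts of synonymous and non-synonymous sites and differences have already been obtained (for example by the Nei-Gojobori counting procedure), and the function names are hypothetical.

```python
import math

def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)

def ka_ks(syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites):
    """Ka/Ks ratio from pre-computed site and difference counts."""
    ks = jukes_cantor(syn_diffs / syn_sites)        # synonymous rate
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)  # non-synonymous rate
    if ks == 0:
        return float("inf") if ka > 0 else float("nan")
    return ka / ks
```

A ratio well below 1 is conventionally read as purifying selection and a ratio above 1 as positive selection.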

Estimation of divergence dates. The tMRCA was estimated based on an
alignment of ORF1ab and nsp5 sequences, using the uncorrelated exponentially distributed
relaxed clock model (UCED) in BEAST version 1.8 (http://evolve.zoo.ox.ac.uk/beast/)
(50). Under this model, the rates were allowed to vary at each branch drawn
independently from an exponential distribution. The sampling dates of all strains were
collected from the literature or from the present study, and were used as calibration points.
Depending on the data set, Markov chain Monte Carlo (MCMC) sample chains were run
for 2 x 10° states, sampling every 1,000 generations under the GTR nt substitution model,
determined by MODELTEST and allowing γ-rate heterogeneity for all data sets. A
constant population coalescent prior was assumed for all data sets. The median and HPD
were calculated for each of these parameters from two identical but independent MCMC
chains using TRACER 1.3 (http://beast.bio.ed.ac.uk). The tree was annotated by
TreeAnnotator, a program of BEAST, and displayed by FigTree
(http://tree.bio.ed.ac.uk/software/figtree/).
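
As a hedged illustration of what an HPD interval summarizes, one common way to estimate it from posterior samples is to take the shortest interval containing the requested fraction of the sorted samples. The Python sketch below follows that approach; it is not claimed to reproduce TRACER's implementation, and the function name `hpd_interval` is hypothetical.

```python
import math

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    k = max(1, math.ceil(mass * n))  # number of samples the interval must cover
    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]
```

Applied to sampled tMRCA values (in calendar years) after burn-in, the function would return the lower and upper bounds reported alongside the median.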





Expression of ORF8 and determination of leader-body junction sequence.
The leader-body junction site and flanking sequences of the ORF8 subgenomic mRNA in
SARSr-Rf-BatCoV YNLF_31C were determined using RT-PCR as described previously
(21, 51). Briefly, RNA was extracted directly from the bat samples using TRIzol Reagent
(Invitrogen). Reverse transcription was performed using random hexamers and the
SuperScript III kit (Invitrogen). cDNA was PCR amplified with a forward primer
(5'-CTACCCAGGAAAAGCCAAC-3') located in the leader sequence and a reverse primer
(5'-TGAACCATAGTGTGCCATCT-3') located in the body of the ORF8 mRNA. The
PCR mixture (25 µl) contained cDNA, PCR buffer (10 mM Tris-HCl pH 8.3, 50 mM KCl,
2 mM MgCl2 and 0.01% gelatin), 200 µM of each dNTP and 1.0 U Taq polymerase
(Applied Biosystems). The mixtures were amplified in 60 cycles of 94°C for 1 min, 50°C
for 1 min and 72°C for 1 min and a final extension at 72°C for 10 min in an automated
thermal cycler (Applied Biosystems). RT-PCR products were subjected to agarose gel
electrophoresis, gel-purified using the QIAquick gel extraction kit (Qiagen) and sequenced to
obtain the leader-body junction sequences for the ORF8 subgenomic mRNA.




Nucleotide sequence accession numbers. The nt and genome sequences of the
CoVs detected in this study have been lodged within the GenBank sequence database
under accession no. KP886808, KP886809, and KP895482 to KP895525.


Accepted Manuscript Posted Online 





RESULTS 

Detection of CoVs in bats. A total of 348 alimentary samples from bats belonging to
five different genera were obtained from various regions of the Yunnan Province. RT-PCR
for a 440-bp fragment of the RdRp gene of CoVs was positive in alimentary
samples from 46 bats of five species belonging to four genera (Table 1, Fig. 1). Sequence 
analysis showed that 35 samples contained diverse alphacoronaviruses, while 11 samples 
contained betacoronaviruses, including two lineage B betacoronaviruses and nine 
lineage D betacoronaviruses. 

Detection of diverse bat alphacoronaviruses. Phylogenetic analysis of the 440-bp
fragments of the RdRp gene of alphacoronaviruses detected in 35 bat samples showed
that two sequences from one Rhinolophus stheno and one Myotis daubentonii captured in
Mojiang possessed 92-93% nt identities to Rhinolophus bat CoV HKU2 (Rh-BatCoV
HKU2) (GenBank accession no. NC_009988.1) (Table 1, Fig. 2). Four sequences from M.
daubentonii in Xiangyun possessed 81% nt identity to Rh-BatCoV HKU2 (GenBank
accession no. NC_009988.1). Twenty-four sequences from M. daubentonii in Xiangyun
possessed 78-99% nt identities to Myotis bat CoV HKU6 (My-BatCoV HKU6) (GenBank
accession no. DQ249224.1). Two sequences from M. daubentonii in Mojiang possessed
96% nt identities to Miniopterus bat CoV HKU7 (Mi-BatCoV HKU7) (GenBank
accession no. DQ249226.1). One sequence from M. daubentonii in Mojiang possessed
96% nt identities to Miniopterus bat CoV HKU8 (GenBank accession no. NC_010438.1).
Two sequences from Hipposideros pomona in Mojiang possessed 81-87% nt identities to
Hipposideros bat CoV HKU10 (Hi-BatCoV HKU10) (GenBank accession no.
JQ989267.1).

Detection of lineage B and D bat betacoronaviruses. Phylogenetic analysis of
the 440-bp fragments of the RdRp gene of betacoronaviruses detected in two bat samples,
YNLF_31C and YNLF_34C, showed that they belonged to Betacoronavirus lineage B,
with 100% nt identities to human SARS-CoV TOR2 (GenBank accession no.
AY274119.3) and 90% nt identities to SARSr-Rs-BatCoV HKU3 (GenBank accession no.
DQ022305), thus representing SARSr-Rf-BatCoVs (Table 1, Fig. 2). Both samples were
collected from greater horseshoe bats (Rhinolophus ferrumequinum) captured in Lufeng
County, Chuxiong Yi Autonomous Prefecture (Fig. 1). Phylogenetic analysis of the 440-bp
fragments of the RdRp gene of betacoronaviruses detected in nine other bat samples
showed that they belonged to Betacoronavirus lineage D, with 75-79% nt identities to
Rousettus bat coronavirus HKU9 (Ro-BatCoV HKU9) (GenBank accession no.
NC_009021.1). These nine samples were collected from Leschenault's rousettes
(Rousettus leschenaulti) in Mengla County, Xishuangbanna Dai Autonomous Prefecture.
Attempts to passage SARSr-Rf-BatCoV YNLF_31C and YNLF_34C in various cell lines
were not successful, with no cytopathic effect or viral replication being detected.

Genome comparison between SARSr-Rf-BatCoV and other SARSr-CoVs.
The complete genome sequences of the two SARSr-Rf-BatCoV strains, YNLF_31C and
YNLF_34C, were obtained by assembly of the sequences of RT-PCR products obtained
directly from alimentary samples. Their genome sizes were 29,723 bases, with a G+C
content of 40.7%, comparable to those of most other SARSr-CoVs (27, 28). They were
closely related to each other with 99.9% overall nt identities, while they possessed 88.2%
nt identities to the genomes of SARSr-Rs-BatCoV HKU3 and 93% nt identities to the
genomes of human/civet SARSr-CoVs. SARSr-Rf-BatCoV strains share a similar genome
organization with other SARSr-CoV strains, containing the putative transcription
regulatory sequence (TRS) motif, 5'-ACGAAC-3', at the 3' end of the 5' leader sequence
and preceding each ORF except ORF7b. Similar to most other SARSr-BatCoVs,
SARSr-Rf-BatCoV YNLF_31C and YNLF_34C contained a single long ORF8.
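
As a minimal sketch of how candidate TRS sites can be located in a genome sequence, the following Python function reports every position of the core motif. It is an illustration only: the function name `find_motif` is hypothetical, and a real annotation would also check that each hit lies upstream of an ORF.

```python
def find_motif(genome, motif="ACGAAC"):
    """Return 1-based start positions of every occurrence of `motif` in `genome`."""
    positions = []
    start = genome.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based coordinate
        start = genome.find(motif, start + 1)  # step by 1 so overlapping hits are kept
    return positions
```

Comparing the returned coordinates with the annotated ORF start sites indicates which ORFs are preceded by the motif.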

The nsp3, S, ORF3 and ORF8 regions are known to be the most rapidly evolving
regions among SARSr-CoV genomes (27, 28, 52, 53). Pairwise comparison of aa
sequences between civet SARSr-CoV SZ3 and other SARSr-CoVs showed that the S and
ORF3a of SARSr-Rf-BatCoV YNLF_31C and YNLF_34C displayed relatively low
sequence identities to civet SARSr-CoV (Table 2). However, the nsp3 of SARSr-Rf-BatCoV
YNLF_31C and YNLF_34C exhibited 97.1% aa identity to civet SARSr-CoV,
which is comparable to the high sequence identity of 96.8 to 97.5% between civet
SARSr-CoV and SARSr-BatCoVs, Rs3367, RsSHC014, WIV1 and BtCoV-Cp/2011,
from Yunnan reported previously (42). Furthermore, an exceptionally high sequence
identity (80.4-81.3% aa identity) was observed in the ORF8 between SARSr-Rf-BatCoVs
and human/civet SARSr-CoVs, much higher than that between human/civet SARSr-CoVs
and other SARSr-BatCoVs (23.2-37.3% aa identity). Therefore, civet SARSr-CoV
SZ3 was most closely related to SARSr-Rs-BatCoV Rs3367 and WIV1 in S and ORF3a,
but was most closely related to SARSr-Rf-BatCoVs in ORF8.

The predicted receptor binding domain (RBD) of SARSr-Rf-BatCoV YNLF_31C
and YNLF_34C possessed 89% and 68.1% aa identities to that of SARSr-Rs-BatCoV
HKU3-1 and civet SARSr-CoV SZ3 respectively. Previous studies have identified five
critical residues (residues 442, 472, 479, 487 and 491) for ACE2 binding in human and
civet SARSr-CoVs (54). In particular, residues 479 and 487 are the two key residues that
are different between human and civet SARSr-CoV strains, with S→T substitution at
residue 487 resulting in 20-fold reduction in human ACE2 binding affinity (54). In
SARSr-Rs-BatCoV Rs3367, two (residues 479 and 491) of the five critical residues were
conserved. In SARSr-Rf-BatCoVs and most other SARSr-Rs-BatCoVs, only residue 491
was conserved (Fig. 3). Compared to human/civet SARSr-CoVs and SARSr-Rs-BatCoV
Rs3367, WIV1 and RsSHC014, the RBD of SARSr-Rf-BatCoV YNLF_31C and
YNLF_34C, similar to some SARSr-BatCoV strains, contained two deletions of 5 aa and
12 aa respectively.

Phylogenetic analysis. Phylogenetic trees were constructed using nsp2, nsp3,
nsp5, nsp12 (RdRp), S, ORF3a, ORF8 and N of SARSr-Rf-BatCoV YNLF_31C and
YNLF_34C and other SARSr-CoVs (Fig. 4). These regions were selected because they
are commonly used in phylogenetic analysis of CoVs (RdRp, S, N), represent regions
of rapid evolution in SARSr-CoVs (nsp3, ORF3, ORF8), or were free from recombination
upon subsequent analysis (nsp2, nsp5). In nsp2, nsp3, nsp5, RdRp, and N genes,
SARSr-Rf-BatCoV YNLF_31C and YNLF_34C were more closely related to other
SARSr-BatCoVs than to two other SARSr-Rf-BatCoV strains, Rf1 and BtCoV/273/2005,
previously detected from greater horseshoe bats in Hubei (28, 37). However, in S, ORF3
and ORF8, SARSr-Rf-BatCoV YNLF_31C and YNLF_34C were most closely related to
SARSr-Rf-BatCoV Rf1 and BtCoV/273/2005, forming a distinct cluster among other
SARSr-BatCoVs.

In the S and ORF3 regions, human/civet SARSr-CoVs were most closely related to
SARSr-Rs-BatCoV Rs3367, WIV1 and RsSHC014 previously detected from Yunnan
bats (42). This is in line with the ability of SARSr-Rs-BatCoV WIV1 to replicate in
Vero E6 cells and use ACE2 as receptor (42). In nsp3, human/civet SARSr-CoVs were
most closely related to SARSr-Rf-BatCoV YNLF_31C and YNLF_34C as well as
SARSr-Rs-BatCoV Rs3367, WIV1 and RsSHC014. Furthermore, in ORF8, SARSr-Rf-BatCoV
strains were clustered with human and civet SARSr-CoV strains with a high
bootstrap value of 990, whereas all SARSr-Rs-BatCoV strains, including Rs3367, WIV1
and RsSHC014, formed another cluster. This concurs with results from pairwise aa
sequence comparison, and suggests that the ORF8 of civet and human SARSr-CoV
originated from SARSr-Rf-BatCoVs from greater horseshoe bats instead of
SARSr-Rs-BatCoVs from Chinese horseshoe bats.

Recombination analysis. Since the ORF8 of SARSr-Rf-BatCoVs showed high
sequence identity to those of human/civet SARSr-CoVs, we hypothesized that the ancestor
of civet SARSr-CoVs acquired its ORF8 from SARSr-Rf-BatCoVs through
recombination between SARSr-Rf-BatCoVs from greater horseshoe bats and
SARSr-Rs-BatCoVs from Chinese horseshoe bats. When civet SARSr-CoV SZ3 was used as the
query for sliding window analysis with SARSr-Rf-BatCoV YNLF_31C and SARSr-Rs-BatCoV
Rs3367 and HKU3 as potential parents, several recombination breakpoints were
observed. In particular, two breakpoints, between which ORF8 was located, were
identified (Fig. 5). Downstream of the first breakpoint at position 27128 and upstream of
the second breakpoint at position 28635, an abrupt change in clustering occurred with
high bootstrap support for clustering of civet SARSr-CoV SZ3 with SARSr-Rf-BatCoV
YNLF_31C. This is in line with results from phylogenetic and similarity plot analysis.
Moreover, using multiple alignments, civet SARSr-CoV SZ3 was shown to possess much
higher sequence similarities to SARSr-Rf-BatCoVs than to SARSr-Rs-BatCoVs within
ORF8, which includes the region corresponding to the 29-nt deletion found in human
SARS-CoVs (Fig. 5).

Besides ORF8, another region of interest was S, which was situated between two
breakpoints at positions 20900 and 26100 respectively (Fig. 5). Downstream of position
20900 and upstream of position 26100, an abrupt change in clustering occurred with high
bootstrap support for clustering of civet SARSr-CoV SZ3 with SARSr-Rs-BatCoV
Rs3367. This is in line with phylogenetic analysis and the ability of strain Rs3367 to use
ACE2 as receptor for cellular entry (42). However, similarity plot analysis still showed
substantial difference between the S of civet SARSr-CoV SZ3 and SARSr-Rs-BatCoV
Rs3367, especially in the S1 region.

Estimation of synonymous and non-synonymous substitution rates. Using all
available SARSr-BatCoV genome sequences for analysis, the Ka/Ks ratios for various
coding regions, as compared to those of civet SARSr-CoVs and human SARS-CoVs, are
shown in Table 3. Notably, the Ka/Ks ratios for most coding regions of SARSr-BatCoVs,
including ORF8 of SARSr-Rf-BatCoVs, were low, supporting purifying selection. In
contrast, many regions of civet SARSr-CoVs and human SARS-CoVs exhibited
relatively high Ka/Ks ratios suggestive of positive selection. Positive selection was
particularly strong at the S (Ka/Ks=3) and ORF3 (Ka/Ks=2) of civet SARSr-CoVs, and
the M (Ka/Ks=2) and ORF8 (Ka/Ks=3.5) of human SARS-CoVs.

Estimation of divergence dates. Using the uncorrelated relaxed clock model on
ORF1ab, the time of the most recent common ancestor (tMRCA) of all SARSr-CoVs was
estimated to be 1960.1 [highest posterior density regions at 95% (HPD), 1899.1 to
1988.6]. The tMRCA of human and civet SARSr-CoVs was estimated to be 2001.5
(HPDs, 1999.1 to 2002.5), approximately 2 years before the SARS epidemic. The
tMRCA of human/civet SARSr-CoVs, SARSr-Rp-BatCoV Rp3/2004, and SARSr-Rs-BatCoV
RsSHC014/2011, Rs3367/2012 and WIV1/2012 was estimated to be 1995.3
(HPDs, 1984.5 to 2001), while that of human/civet SARSr-CoVs and SARSr-Rf-BatCoVs
was estimated to be 1990.6 (HPDs, 1973.2 to 1999.6) (Fig. 6).

Since some regions in ORF1ab may be involved in recombination (Fig. 5), nsp5,
which was free from recombination, was also used for analysis and showed similar tree
topology. Using the uncorrelated relaxed clock model on nsp5, the tMRCA of all
SARSr-CoVs was estimated to be 1961.5 (HPDs, 1898.9 to 1991.5). The tMRCA of
human and civet SARSr-CoVs was estimated to be 2000.7 (HPDs, 1996.7 to 2002.6),
approximately 2 years before the SARS epidemic. The tMRCA of human/civet SARSr-CoVs,
SARSr-Rp-BatCoV Rp3/2004, and SARSr-Rs-BatCoV RsSHC014/2011,
Rs3367/2012 and WIV1/2012 was estimated to be 1996.3 (HPDs, 1985.2 to 2001.7),
while that of human/civet SARSr-CoVs and SARSr-Rf-BatCoVs was estimated to be
1989.9 (HPDs, 1969.6 to 2000.3) (Fig. 6). The estimated mean substitution rates of the
ORF1ab and nsp5 data sets under the uncorrelated exponentially distributed relaxed clock
model (UCED) were 2.00 x10? and 1.36 x10? substitution per site per year respectively, 
which are comparable to other CoVs and RNA viruses (55, 56). 

Expression of ORF8 and determination of leader-body junction sequence.
CoVs are characterized by a unique mechanism of discontinuous transcription with the
synthesis of a nested set of subgenomic mRNAs (1, 2). To determine whether ORF8 is
expressed in SARSr-Rf-BatCoV and the location of the leader and body TRS used for
mRNA synthesis, the leader-body junction sites and flanking sequences of ORF8
subgenomic mRNA were determined. The obtained subgenomic mRNA sequence was
aligned to the leader sequence, which confirmed the core sequence of the TRS motifs as
5'-ACGAAC-3' (Fig. 7), as in other SARSr-CoVs. The leader TRS and the ORF8
subgenomic mRNA exactly matched each other. The SARSr-Rf-BatCoV leader was
confirmed as the first 66 nt of the genome.

DISCUSSION

The ORF8 of civet SARSr-CoV is likely to have been acquired from SARSr-Rf-BatCoVs
in greater horseshoe bats (R. ferrumequinum) through recombination. In this study, two
SARSr-Rf-BatCoV strains, YNLF_31C and YNLF_34C, were identified from greater
horseshoe bats. Although their genomes only possessed 93% nt identities to the genomes
of human/civet SARSr-CoVs, which is lower than the 95% nt identities between
human/civet SARSr-CoV and SARSr-Rs-BatCoVs, Rs3367 and RsSHC014, from
Chinese horseshoe bats in Yunnan, the nsp3 and ORF8 of SARSr-Rf-BatCoV
YNLF_31C and YNLF_34C exhibited the highest aa identities among all SARSr-BatCoVs
to those of civet SARSr-CoV SZ3. In particular, their ORF8 demonstrated much
higher aa identities (81.3%) to civet SARSr-CoV SZ3 than SARSr-BatCoVs from other
horseshoe bats (23.2% to 37.3%). Phylogenetic analysis of the ORF8 revealed a distinct
clade formed by human/civet SARSr-CoVs and SARSr-Rf-BatCoVs separate from other
SARSr-BatCoVs. This is in line with a previous report showing that the ORF8 of
SARSr-Rf-BatCoV Rf1 was clustered with human/civet SARSr-CoVs but not SARSr-BatCoV
Rm1 and Rp3 upon phylogenetic analysis, although only one SARSr-Rf-BatCoV strain
was available for analysis (28). Moreover, potential recombination sites were identified
between SARSr-Rf-BatCoVs and SARSr-Rs-BatCoVs around the ORF8 region, leading
to the generation of civet SARSr-CoV SZ3 with the ORF8 acquired from SARSr-Rf-BatCoVs.
Similar to other regions of the genome, the ORF8 of SARSr-Rf-BatCoVs has
been under purifying selection, which supports greater horseshoe bats as a reservoir for
SARSr-Rf-BatCoVs. In contrast, the ORF8 of human SARS-CoVs was under strong
positive selection, which reflects the rapid evolution soon after interspecies jumping.




These findings support recombination as the key mechanism involved in the
acquisition of ORF8 by the ancestor of civet SARSr-CoVs. In fact, previous studies have
demonstrated frequent recombination events between SARSr-Rs-BatCoV strains from
different bat species of different geographical locations in China (22, 55). Moreover, a
recombination breakpoint at the nsp16/S intergenic region was detected between
SARSr-Rp-BatCoV Rp3 from Pearson's horseshoe bats (Rhinolophus pearsoni) and
SARSr-Rf-BatCoV Rf1 during the evolution of SARSr-BatCoVs to civet SARSr-CoV (22).
On the other hand, some genomic regions of SARSr-Rf-BatCoV YNLF 31C and YNLF 34C,
such as nsp3, RdRp and N, were evolutionarily distinct from those of two previously
reported SARSr-Rf-BatCoV strains, Rf1 and 273/2005, upon phylogenetic analysis. This
suggests that SARSr-Rf-BatCoVs from different geographical locations in China may have
evolved separately through other recombination events. The present findings offer new
insights into the origin and evolution of SARS-CoV, by showing that the ancestor of civet
SARSr-CoV is likely a recombinant virus with ORF8 originating from SARSr-Rf-BatCoVs
in greater horseshoe bats and other genome regions from different horseshoe
bats.

Although SARSr-Rs-BatCoV Rs3367 and RsSHC014 represented the closest bat
CoVs to SARS-CoV in terms of genome identity, they were unlikely to be the immediate
ancestor of civet SARSr-CoVs. Previous molecular-dating studies estimated that the time
of divergence between human/civet and bat SARSr-CoVs ranged from 4 to 17 years
before the SARS epidemic (22, 55, 57). SARSr-CoVs were also shown to be a newly
emerged subgroup of Betacoronavirus, with the median date of their MRCA estimated to
be from 1961 to 1982 (55, 57). The present results are in line with such estimations, with
the tMRCA between human/civet and closest bat strains estimated to be approximately
1995 (8 years before the SARS epidemic) and that among all SARSr-CoVs
approximately 1960 using ORF1ab. Similar results were also obtained when using the nsp5
region, which was recombination-free. Moreover, we demonstrated that SARSr-Rf-BatCoV
YNLF 31C and YNLF 34C only diverged from civet/human SARSr-CoVs at
approximately 1990. This is in contrast to previous studies that showed SARSr-Rp-BatCoV
Rp3 as the only recently diverged strain (55, 57). Together with the evidence on
the acquisition of ORF8, it is likely that civet SARSr-CoV originated from
recombination between SARSr-Rs-BatCoVs and SARSr-Rf-BatCoVs from different
horseshoe bat species within several years before the SARS epidemic.

The overlapping habitat and geographical distribution of different horseshoe bats
may have fostered recombination between different SARSr-BatCoVs and emergence of
SARS-CoV. Chinese horseshoe bats are widely distributed throughout China including
Yunnan, Guangdong and Hong Kong. While greater horseshoe bats are also widely
distributed across different provinces in China including Yunnan, they are not found in
Guangdong (58). The two bat species share a similar diet and habits such as the ability to
roost in man-made structures, suggesting that they may cohabit in similar
environments in Yunnan, the province with the highest biodiversity in China. In fact,
SARSr-Rf-BatCoV YNLF 31C and YNLF 34C, and SARSr-Rs-BatCoV Rs3367 and
RsSHC014 were detected in Lufeng and Kunming of Yunnan province, respectively,
which are only ~80 km apart and within the migration distances of horseshoe bats (Fig.
1) (22, 59, 60). Since greater horseshoe bats are not found in Guangdong, recombination
between SARSr-Rf-BatCoVs and SARSr-Rs-BatCoVs with the generation of the ancestor
of civet SARSr-CoVs may have occurred in yet unidentified bats in Yunnan or nearby
provinces, which were then transported to wildlife markets in Guangdong and infected
civets. Alternatively, recombination may have occurred in civets or other animals within
wildlife farms or markets where many different wild animal species are often housed
together (61). A possible scenario is that the animals were co-infected with
SARSr-Rf-BatCoVs and SARSr-Rs-BatCoVs from different horseshoe bats, followed by
recombination events. More extensive surveillance in bats from Yunnan and neighboring
provinces, as well as in wildlife markets in Guangdong, may reveal the immediate ancestor
of civet SARSr-CoVs.

The ORF8 region, unique to SARSr-CoVs, is prone to mutations or deletions
during interspecies transmission. One of the most striking genomic changes observed in
SARS-CoV soon after its zoonotic transmission to humans was the acquisition of a
characteristic 29-nt deletion which splits ORF8 into two ORFs, ORF8a and ORF8b (25,
62). While SARS-CoVs isolated from the later human cases of the epidemic contained
this 29-nt deletion, isolates from civets and some early human cases possessed a single
continuous ORF8 (25, 63). Besides, some early human strains and a farmed civet strain
from Hubei possessed an alternative 82-nt deletion in ORF8 (63). On the other hand, four
late human isolates possessed a 415-nt deletion, resulting in the loss of the entire ORF8
(63). Although studies using reverse genetics showed that ORF8 is not essential for
virus replication in vitro and in vivo (64, 65), the full-length 8ab protein is a functional
protein that is delivered by a cleavable signal sequence to the lumen of the endoplasmic
reticulum, where it becomes N-glycosylated (62). Different subcellular localizations and
functions have also been reported for the 8ab, 8a and 8b proteins (66-69). Inside the
endoplasmic reticulum, 8ab activates the ATF6 branch of the unfolded-protein response (70).
The 8a protein enhances SARS-CoV replication and induces caspase-dependent apoptosis
through a mitochondria-dependent pathway (66). Moreover, antibodies against the 8a protein
have been detected in sera of SARS patients (66). The 8b protein down-regulates the
expression of the E protein, which supports a modulatory role in viral replication (68).
Moreover, overexpression of the 8b protein induces DNA synthesis (67). The 8b and 8ab
proteins also play a role in the host ubiquitin-proteasome system (71). In this study, the
expression of ORF8 subgenomic mRNA in SARSr-Rf-BatCoV YNLF 31C suggested
that this protein may also be functional in SARSr-BatCoVs. Moreover, the high Ka/Ks
ratio among human SARS-CoVs compared to SARSr-BatCoVs supports the notion that ORF8 is
subject to rapid evolution under strong positive selection during animal-to-human
transmission. Further studies may help understand the importance of ORF8 evolution for
interspecies transmission of SARSr-CoVs.
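
The Ka/Ks argument above can be illustrated with the ORF8 row of Table 3. The following is a minimal sketch, assuming the conventional reading of the ratio (below 1 suggests purifying selection, above 1 suggests positive selection, undefined when Ks is zero). The function names `ka_ks_ratio` and `selection_regime` are illustrative and do not correspond to the software used in the study; only the Ka and Ks inputs are taken from Table 3.

```python
from typing import Optional


def ka_ks_ratio(ka: float, ks: float) -> Optional[float]:
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    if ks == 0:
        return None
    return ka / ks


def selection_regime(ratio: Optional[float]) -> str:
    """Conventional interpretation of a Ka/Ks ratio (illustrative labels)."""
    if ratio is None:
        return "undefined"
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "purifying"
    return "neutral"


# ORF8 values (Ka, Ks) as listed in Table 3.
orf8 = {
    "SARSr-Rf-BatCoV": (0.021, 0.110),
    "SARSr-Rs-BatCoV": (0.035, 0.197),
    "Civet SARSr-CoV": (0.004, 0.000),
    "Human SARS-CoV": (0.007, 0.002),
}

orf8_regimes = {
    host: selection_regime(ka_ks_ratio(ka, ks)) for host, (ka, ks) in orf8.items()
}
```

With these inputs the two bat groups fall below 1 and the human group is about 3.5, while the civet ratio is undefined because Ks is zero. Because the tabulated Ka and Ks are rounded to three decimals, recomputed ratios can differ slightly from the printed ones.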

Besides SARSr-BatCoVs, diverse alphacoronaviruses and betacoronaviruses,
including potentially novel CoVs, with potential interspecies transmission events were
identified in this study. Bats are known to be important reservoirs of lineage B, C and D
betacoronaviruses, while rodents are likely the reservoir of lineage A betacoronaviruses
(30). Nine samples containing lineage D betacoronaviruses were detected in
Leschenault's rousettes (R. leschenaulti), a known reservoir of Ro-BatCoV HKU9 (24).
However, the partial RdRp sequences only possessed 75-79% nt identities to the latter,
suggesting that they may represent either a novel CoV species or a novel genotype of
Ro-BatCoV HKU9. As for alphacoronaviruses, 24 samples from Daubenton's bats (M.
daubentonii) contained viruses most closely related to My-BatCoV HKU6 with 78-99%
nt identities in the partial RdRp region, which may represent My-BatCoV HKU6 or
related viruses previously reported in the same bat species (38). Six samples contained
alphacoronaviruses most closely related to Rh-BatCoV HKU2. However, four samples
(YNXY 7C, YNXY 10C, YNXY 45 and YNXY 50C) from Daubenton's bats
possessed partial RdRp sequences of only 80-81% nt identities to that of Rh-BatCoV
HKU2, suggesting that they may represent novel CoVs. Although the other two samples
(MJ 27C and MJ 69C) possessed RdRp sequences with 92-93% identities to that of
Rh-BatCoV HKU2, they were detected from Daubenton's bats and lesser brown horseshoe
bats (R. stheno) instead of Chinese horseshoe bats (R. sinicus) previously reported to
carry Rh-BatCoV HKU2 (34). This may suggest interspecies transmission of Rh-BatCoV
HKU2 among different bat species. Two samples from Pomona roundleaf bats
(Hipposideros pomona) contained alphacoronaviruses most closely related to
Hi-BatCoV HKU10. However, the partial RdRp sequences only possessed 81-87% nt
identity to the latter. We have previously described recent interspecies transmission of
BatCoV HKU10 between Leschenault's rousettes (R. leschenaulti) and Pomona
roundleaf bats, two very different bats belonging to different families, through rapid
evolution of the S protein (72). Further studies are warranted to determine if the two
samples from Pomona roundleaf bats contained potentially novel CoVs closely related to
BatCoV HKU10 or variants of BatCoV HKU10 due to interspecies transmission.

LEGENDS TO FIGURES

FIG 1 Map showing five locations of bat sampling in four autonomous prefectures (AP) in 
Yunnan Province, China. Sampling locations in Yunnan are in red. The location of SARSr-Rs- 
BatCoV strains, Rs3367 and RsSHC014, detected in a previous study (42) is in blue. 

FIG 2 Phylogenetic analysis of the nt sequences of the 267-nt fragment of RdRp of the 46
positive samples identified in bats in Yunnan in this study. The tree was constructed by the
maximum likelihood method with the model GTR+G. Bootstrap values were calculated from
1000 trees and only values ≥70% are shown and given at nodes. The scale bar indicates 5 nt
substitutions per site. The two SARSr-Rf-BatCoV strains YNLF 31C and YNLF 34C are in
red. The potentially novel bat CoVs are in purple. AntelopeCoV, sable antelope coronavirus
(EF424621); BatCoV CDPHE15/USA/2006, Bat coronavirus CDPHE15/USA/2006
(NC_022103.1); BatCoV/SC2013, Betacoronavirus/SC2013 (KJ473821.1); Erinaceus
CoV/VMC/DEU/2012, Betacoronavirus Erinaceus/VMC/DEU/2012 (NC_022643); BCoV,
bovine coronavirus (NC_003045); BdHKU22, bottlenose dolphin coronavirus HKU22
(KF793826); BuCoV HKU11, bulbul coronavirus HKU11 (FJ376619); BWCoV SW1, beluga
whale coronavirus SW1 (NC_010646); CCoV, Canine coronavirus strain CCoV/NTU336/F/2008
(GQ477367.1); CCRCoV, Canine respiratory coronavirus strain K37 (JX860640.1); CmCoV
HKU21, common moorhen coronavirus HKU21 (NC_016996); CoV
Neoromicia/PML-PHE1/RSA/2011, coronavirus Neoromicia/PML-PHE1/RSA/2011 (KC869678);
DcCoV HKU23, dromedary camel coronavirus HKU23 (KF906251); ECoV, equine coronavirus
(NC_010327); FIPV, feline infectious peritonitis virus (AY994055); GiCoV, Giraffe
coronavirus US/OH3-TC/2006 (EF424622.1); HCoV-229E, human coronavirus 229E
(NC_002645); HCoV-HKU1, human coronavirus HKU1 (NC_006577); HCoV-NL63, human
coronavirus NL63 (NC_005831); HCoV-OC43, human coronavirus OC43 (NC_005147);
Hi-batCoV HKU10, Hipposideros bat coronavirus HKU10 (JQ989269); IBV-beaudette,
beaudette coronavirus (AY692454); Human MERS-CoV, Middle East respiratory syndrome
coronavirus (NC_019843.3); Human MERS-CoV EMC/2012, Human betacoronavirus
2c EMC/2012 (JX869059.2); Camel MERS-CoV KSA-CAMEL-363, Middle East respiratory
syndrome coronavirus isolate KSA-CAMEL-363 (KJ713298); MRCoV HKU18, magpie robin
coronavirus HKU18 (NC_016993); BatCoV 1A, Miniopterus bat coronavirus 1A (NC_010437);
BatCoV 1B, Miniopterus bat coronavirus 1B (NC_010436); Mi-batCoV HKU7, Miniopterus bat
coronavirus HKU7 (DQ249226); Mi-batCoV HKU8, Miniopterus bat coronavirus HKU8
(NC_010438); Mink CoV strain WD1127, Mink coronavirus strain WD1127 (NC_023760.1);
MunCoV HKU13, munia coronavirus HKU13 (FJ376622); MHV-A59, murine hepatitis
virus (NC_001846); My-batCoV HKU6, Myotis bat coronavirus HKU6 (DQ249224); NH CoV
HKU19, night heron coronavirus HKU19 (NC_016994); PEDV, porcine epidemic diarrhoea
virus (NC_003436); PHEV, porcine haemagglutinating encephalomyelitis virus
(NC_007732); Pi-BatCoV-HKU5-1, Pipistrellus bat coronavirus HKU5 (NC_009020); PorCoV
HKU15, porcine coronavirus HKU15 (NC_016990); PRCV, porcine respiratory coronavirus
(DQ811787); RbCoV HKU14, rabbit coronavirus HKU14 (NC_017083); RatCoV parker, rat
coronavirus parker (NC_012936); Rs-batCoV HKU2, Rhinolophus bat coronavirus HKU2
(EF203064); Ro-batCoV-HKU9, Rousettus bat coronavirus HKU9 (NC_009021); Ro-batCoV
HKU10, Rousettus bat coronavirus HKU10 (JQ989270); Human SARS-CoV TOR2,
SARS-related human coronavirus (NC_004718); Civet SARS-CoV SZ16, SARS-related palm civet
coronavirus (AY304488); Badger SARS-CoV, SARS-related badger coronavirus
CFB/SZ/94/03 (AY545919.1); SARSr-Rs-batCoV HKU3, SARS-related Rhinolophus bat
coronavirus HKU3 (DQ022305); Scotophilus BatCoV 512, Scotophilus bat coronavirus 512
(NC_009657); SpCoV HKU17, sparrow coronavirus HKU17 (NC_016992); TCoV, turkey
coronavirus (NC_010800); TGEV, transmissible gastroenteritis virus (DQ443743); ThCoV
HKU12, thrush coronavirus HKU12 (FJ376621); Ty-BatCoV-HKU4-1, Tylonycteris bat
coronavirus HKU4 (NC_009019); WECoV HKU16, white-eye coronavirus HKU16
(NC_016991); WiCoV HKU20, wigeon coronavirus HKU20 (NC_016995).

FIG 3 Multiple alignment of the amino acid sequences of the receptor-binding motifs of the
spike proteins of human and civet SARSr-CoV and the corresponding sequences of
SARSr-BatCoVs in different Rhinolophus species. Asterisks indicate positions that have fully
conserved residues. Amino acid deletions among some SARSr-BatCoVs are highlighted in yellow.
The five critical residues for receptor binding in human SARS-CoV, at positions
442, 472, 479, 487 and 491, are highlighted in pink.

FIG 4 Phylogenetic analyses of nsp2, nsp3, nsp5, RdRp, S, ORF3, ORF8 and N nucleotide
sequences of SARSr-BatCoVs from different bat species. The trees were constructed by the
maximum likelihood method using the (A) GTR+G, (B) GTR+G, (C) GTR+G+I, (D) TN93+G, (E)
GTR+G, (F) TN93+G, (G) T92+G and (H) GTR+G substitution models, respectively, with bootstrap
values calculated from 1000 trees. Except for ORF3 and ORF8, all trees were rooted using the
corresponding sequences of HCoV HKU1 (GenBank accession number NC_006577). Only
bootstrap values >70% are shown. (A) 1736, (B) 5019, (C) 908, (D) 2777, (E) 3638,
(F) 804, (G) 345 and (H) 1222 nt positions, respectively, were included in the analyses. The
scale bars represent (A) 50, (B) 10, (C) 20, (D) 20, (E) 10, (F) 20, (G) 10 and (H) 200
substitutions per site, respectively. Human and civet SARSr-CoVs are in green,
SARSr-Rs-BatCoVs from R. sinicus are in blue and SARSr-Rf-BatCoVs from R. ferrumequinum
are in red. The two SARSr-Rf-BatCoV strains YNLF 31C and YNLF 34C detected in this
study are bolded.

FIG 5 (A) Bootscan (upper panel) and Simplot (lower panel) analysis using the genome
sequence of civet SARSr-CoV strain SZ03 as the query sequence. Bootscanning was conducted
with Simplot version 3.5.1 (F84 model; window size, 1000 bp; step, 200 bp) on a gapless nt
alignment generated with ClustalX. The red line denotes SARSr-Rf-BatCoV strain YNLF 31C,
the blue line denotes SARSr-Rs-BatCoV strain Rs3367 and the black line denotes
SARSr-Rs-BatCoV strain HKU3-1. The ORF8 region with potential recombination is highlighted
in yellow. (B) Multiple alignment of nt sequences from genome position 27000 to 28700. Bases
conserved between civet SARSr-CoV SZ03 and SARSr-Rf-BatCoVs (strains YNLF 31C and Rf1)
are marked in yellow boxes. Bases conserved between civet SARSr-CoV SZ03 and
SARSr-Rs-BatCoVs (strains Rs3367 and HKU3-1) are marked in green boxes. The 29-nt deletion
in human SARS coronavirus TOR2 is highlighted in orange. The start codon and stop codon of
ORF8 are labelled with black boxes.
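
The sliding-window idea behind the similarity plot in FIG 5A (window size 1000 bp, step 200 bp, gapless alignment) can be sketched as follows. This is a simplified illustration only: it scores raw percent identity per window, whereas the analysis described in the legend used Simplot with the F84 model, and bootscanning additionally builds bootstrapped trees per window. The function name `sliding_similarity` and the toy sequences are hypothetical.

```python
from typing import List, Tuple


def sliding_similarity(
    query: str, reference: str, window: int = 1000, step: int = 200
) -> List[Tuple[int, float]]:
    """Percent identity of reference to query in successive windows.

    Both sequences are assumed to come from the same gapless alignment.
    Returns (window midpoint, percent identity) pairs; a trailing partial
    window is not scored.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must have the same aligned length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    points = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        same = sum(1 for a, b in zip(q, r) if a == b)
        points.append((start + window // 2, 100.0 * same / window))
    return points


# Toy example: the reference matches the query only in its first half.
toy_query = "A" * 20
toy_reference = "A" * 10 + "C" * 10
toy_points = sliding_similarity(toy_query, toy_reference, window=10, step=5)
```

Plotting such points for each candidate parental strain against the same query gives curves that cross where the closest relative changes, which is the pattern the legend describes around the ORF8 region.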

FIG 6 Estimation of tMRCA of SARSr-CoVs based on ORF1ab (A) and nsp5 (B). The mean
estimated dates were labeled. The taxa were labeled with their sampling dates.

FIG 7 SARSr-Rf-BatCoV YNLF 31C mRNA leader-body junction and flanking sequences. The
subgenomic ORF8 mRNA sequences are shown in alignment with the leader and the genomic
sequence. The start codon AUG in subgenomic RNA is depicted in red. The putative TRS is
depicted in boldface type and underlined. Identical bases between leader sequence and
subgenomic mRNA sequence are in blue. Identical bases between genome and subgenomic
mRNA sequences are in green.




Accepted Manuscript Posted Online 



















Table 1. Detection of CoVs in different bat species by RT-PCR of the 440-bp fragment of RdRp gene

Scientific name            Common name                 No. of bats  No. of bats       CoV detected/closest  Nt identity to     Sampling location
                                                       tested       positive for CoV  match in GenBank      closest match (%)  of positive bats
Rhinolophus luctus         Woolly horseshoe bat        32           0                 -                     -                  -
Rhinolophus affinis        Intermediate horseshoe bat  22           0                 -                     -                  -
Rhinolophus ferrumequinum  Greater horseshoe bat       11           2                 SARS-CoV (2)          100                Lufeng
Rhinolophus stheno         Lesser brown horseshoe bat  34           1                 Rs-BatCoV HKU2 (1)    92                 Mojiang
Hipposideros pomona        Pomona roundleaf bat        17           2                 Hi-BatCoV HKU10 (2)   81-87              Mojiang
Myotis daubentonii         Daubenton's bat             98           32                My-BatCoV HKU6 (24)   78-99              Xiangyun
                                                                                      Rs-BatCoV HKU2 (1)    93                 Mojiang
                                                                                      Rs-BatCoV HKU2 (4)    80-81              Xiangyun
                                                                                      Mi-BatCoV HKU7 (2)    96                 Mojiang
                                                                                      Mi-BatCoV HKU8 (1)    96                 Mojiang
Rousettus leschenaulti     Leschenault's rousette      115          9                 Ro-BatCoV HKU9 (9)    75-79              Mengla
Unknown bat species                                    19           0                 -                     -                  -
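
The per-species counts in Table 1 can be cross-checked against the totals stated elsewhere in the manuscript (348 bats sampled; 46 positive samples in the FIG 2 legend). The following is a small sketch of that tally; the variable and function names are illustrative, and only the counts are taken from Table 1.

```python
# (species, bats tested, bats positive for CoV), as listed in Table 1.
TABLE_1_COUNTS = [
    ("Rhinolophus luctus", 32, 0),
    ("Rhinolophus affinis", 22, 0),
    ("Rhinolophus ferrumequinum", 11, 2),
    ("Rhinolophus stheno", 34, 1),
    ("Hipposideros pomona", 17, 2),
    ("Myotis daubentonii", 98, 32),
    ("Rousettus leschenaulti", 115, 9),
    ("Unknown bat species", 19, 0),
]


def detection_rate(tested: int, positive: int) -> float:
    """Percentage of tested bats that were CoV positive."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * positive / tested


total_tested = sum(tested for _, tested, _ in TABLE_1_COUNTS)
total_positive = sum(positive for _, _, positive in TABLE_1_COUNTS)
overall_rate = detection_rate(total_tested, total_positive)
```

The tallies come to 348 bats tested and 46 positive, consistent with the figures quoted in the text, for an overall detection rate of roughly 13%.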


Table 2. Percentage amino acid identities of the selected predicted gene products of SARSr-CoVs to civet SARSr-CoV strain SZ3

Strain                              nsp2   nsp3   nsp5   nsp12  S      ORF3   E      M      ORF8*  N
Civet SARSr-CoV civet007            99.5   99.5   100.0  99.7   98.6   98.1   100.0  100.0  98.3   90.9
Civet SARSr-CoV SZ16                100.0  99.9   100.0  99.9   99.9   100.0  100.0  100.0  98.3   100.0
Human SARS-CoV BJ01                 99.8   99.6   100.0  99.9   98.8   98.1   100.0  99.5   38.2   90.3
Human SARS-CoV GZ02                 99.8   99.8   100.0  99.9   99.0   97.8   100.0  99.5   98.3   ?
Human SARS-CoV Tor2                 99.8   99.6   100.0  99.9   98.6   98.1   100.0  99.5   37.3   100.0
SARSr-Rs-BatCoV Rs3367              97.8   96.8   100.0  99.6   92.3   96.7   99.1   97.7   32.2   90.5
SARSr-Rs-BatCoV RsSHC014            98.3   96.8   99.7   99.6   90.1   96.7   99.1   97.7   33.0   99.5
SARSr-Rs-BatCoV WIV1                97.8   96.8   99.7   99.5   92.3   96.3   99.6   97.7   32.2   90.6
SARSr-Rs-BatCoV HKU3-1              90.6   91.7   99.3   98.6   77.9   81.3   97.4   98.2   31.4   ?
SARSr-Rs-BatCoV HKU3-2              90.6   91.7   99.3   98.6   77.8   81.3   96.5   98.2   31.4   96.7
SARSr-Rs-BatCoV HKU3-3              90.6   91.7   99.3   98.6   77.9   81.3   96.1   98.2   31.4   90.8
SARSr-Rs-BatCoV HKU3-6              90.6   91.7   99.3   98.5   78.0   81.3   97.4   98.2   31.4   96.4
SARSr-Rs-BatCoV HKU3-8              90.0   91.7   99.0   98.8   78.1   81.7   97.4   96.4   23.2   90.9
SARSr-Rs-BatCoV HKU3-12             90.4   91.7   99.3   98.0   78.1   81.7   97.4   98.2   31.4   ?
SARSr-Rs-BatCoV HKU3-13             90.6   91.2   99.3   98.6   78.0   81.0   97.4   98.2   31.4   96.4
SARSr-Rs-BatCoV Rs672/2006          98.3   87.1   99.3   99.7   78.0   89.4   98.7   98.2   32.2   ?
SARSr-Rb-BatCoV BM48-31/BGR         70.8   75.9   94.4   97.7   74.8   69.4   96.5   89.4   -      87.2
SARSr-Rm-BatCoV 279/2005            89.6   90.3   99.7   99.1   78.6   ?      97.4   96.8   31.7   ?
SARSr-Rm-BatCoV Rm1                 89.5   90.0   99.3   92.4   78.7   83.2   97.8   96.8   33.0   ?
SARSr-Rp-BatCoV Rp3                 96.7   95.1   99.7   92.8   78.4   83.2   99.6   96.8   33.0   97.9
SARSr-Rp-BatCoV Rp/Shaanxi2011      93.6   93.0   100.0  92.3   79.0   82.1   90.0   96.4   33.0   ?
SARSr-Cp-BatCoV Cp/Yunnan2011       90.8   97.5   100.0  92.2   78.9   89.4   97.0   98.6   31.4   98.1
SARSr-Rf-BatCoV Rf1                 90.1   92.0   99.7   91.6   76.5   85.7   96.1   97.3   80.4   91.5
SARSr-Rf-BatCoV 273/2005            89.8   92.3   99.7   98.4   76.6   85.7   98.7   97.3   80.4   ?
SARSr-Rf-BatCoV YNLF 31C            95.0   97.1   99.7   99.5   77.3   86.8   97.4   98.2   81.3   98.1
SARSr-Rf-BatCoV YNLF 34C            95.0   97.1   99.7   99.0   77.3   86.8   99.1   98.2   81.3   ?

*The high amino acid identities in nsp3 and ORF8 between SARSr-Rf-BatCoVs and civet SARSr-CoV are in bold.

Table 3. Non-synonymous and synonymous substitution rates in the coding regions of SARSr-CoVs among different hosts
































         SARSr-Rf-BatCoV (n=4)   SARSr-Rs-BatCoV (n=17)  Civet SARSr-CoV (n=18)  Human SARS-CoV (n=122)
Gene     Ka      Ks      Ka/Ks^a Ka      Ks      Ka/Ks^a Ka      Ks      Ka/Ks^a Ka      Ks      Ka/Ks^a
nsp1     0.013   0.081   0.161   0.003   0.108   0.028   0.000   0.000   -       0.000   0.000   -
nsp2     0.036   0.349   0.103   0.023   0.230   0.100   0.001   0.003   0.333   0.000   0.00    0.000
nsp3     0.030   0.414   0.073   0.018   0.288   0.063   0.001   0.002   0.500   0.004   0.005   0.800
nsp4     0.012   0.391   0.031   0.010   0.222   0.045   0.001   0.002   0.500   0.002   0.002   1.000
nsp5     0.003   0.442   0.007   0.004   0.244   0.016   0.001   0.000   -       0.000   0.00    0.000
nsp6     0.009   0.331   0.027   0.005   0.178   0.028   0.000   0.002   0.000   0.002   0.001   2.000
nsp7     0.018   0.549   0.033   0.000   0.181   0.000   0.002   0.000   -       0.000   0.00    0.000
nsp8     0.004   0.249   0.016   0.003   0.175   0.017   0.001   0.000   -       0.000   0.000   -
nsp9     0.000   0.199   0.000   0.003   0.199   0.015   0.001   0.000   -       0.001   0.000   -
nsp10    0.011   0.355   0.031   0.000   0.158   0.000   0.000   0.000   -       0.002   0.002   1.000
nsp12    0.038   0.109   0.349   0.026   0.076   0.342   0.000   0.003   0       0.001   0.001   1.000
nsp13    0.002   0.347   0.006   0.002   0.199   0.010   0.000   0.003   0       0.001   0.001   1.000
nsp14    0.006   0.485   0.012   0.005   0.270   0.019   0.001   0.003   0.333   0.001   0.001   1.000
nsp15    0.016   0.452   0.035   0.012   0.275   0.044   0.000   0.000   -       0.000   0.00    0
nsp16    0.008   0.306   0.026   0.005   0.277   0.018   0.002   0.002   1.000   0.002   0.003   0.667
S        0.012   0.174   0.070   0.049   0.412   0.119   0.003   0.001   3.000   0.001   0.002   0.500
ORF3     0.012   0.065   0.185   0.041   0.220   0.186   0.002   0.001   2.000   0.072   0.386   0.187
E        0.015   0.070   0.214   0.003   0.037   0.081   0.000   0.000   -       0.001   0.002   0.500
M        0.003   0.096   0.313   0.007   0.097   0.072   0.001   0.002   0.500   0.002   0.001   2.000
ORF8^b   0.021   0.110   0.190   0.035   0.197   0.178   0.004   0.000   -       0.007   0.002   3.500
N        0.015   0.143   0.105   0.008   0.069   0.116   0.002   0.005   0.400   0.000   0.001   0

^a Ka/Ks ratios of ≥0.5 are in bold.
^b Only ORF8 sequences without deletions were included in analysis (civet and human columns).
MRCoV HKU18 
ThCoV HKU12 
WECoV HKU16 
PorCoV HKU15 
% SpCoV HKU17 
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CRCoV 

ECoV 

BCoV 

GiCoV 
‘Antelope CoV 
DcCoV HKU23 
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Amino acid residue position 442 


NYKYRYLRHGKLRPFERDIS 
NYKYRYLRHGKLRPFERDIS 


Human SARS-CoV 

Human SARS-CoV 

Human SARS-CoV 

Civet SARS-CoV 

Civet SARS-CoV 

SARSr-Rs-BatCoV 
SARSr-Rs-BatCoV 
SARSr-Rs-BatCoV 
SARSr-Ra-BatCoV 
SARSr-Rm-BatCoV 
SARSr-Rm-BatCoV 
SARSr-Rp-BatCoV 
SARSr-Cp-BatCoV 
SARSr-Rp-BatCoV 
SARSr-Rs-BatCoV 
SARSr-Rb-BatCoV 
SARSr-Rs-BatCoV 
SARSr-Rs-BatCoV 
SARSr-Rf-BatCoV 
SARSr-Rf-BatCoV 
SARSr-Rf-BatCoV 
SARSr-Rf-BatCoV 
Clustal Consens 


В201 

TOR2 

6202 

523 

civet007 
К53367 
RsSHC014 

WIV1 

LYRall 

Rmi 

279/2005 

Rp3 
Cp/Yunnan2011 
Rp/Shaanxi2011 
Rs672/2006 
BM48-31/BGR/2008 
HKU3-1 

HKU3-7 

Red 

273/2005 
YNLF 31С 
YNLF 34С 

us 





NTRNID 


YKYRYLRHGKLRPFERDIS. 
YKYRYLRHGKLRPFERDIS. 


NYKYRYLRHGKLRPFERDISN 


YKYRSLRHGKLRPFERDIS. 


YLYRWVRRSKLNPYERDLS. 
YNYKYRSLRHGKLRPFERDISN 


'"NYKYRSLRHGKLRPFERDIS 
QYYYRSYRKEKLKPFERDLS 
QYYYRSYRKEKLKPFERDLS 
QYYYRSHRKTKLKPFERDLS 
QYYYRSSRKTKLKPFERDLS 
QYYYRSSRKEKLKPFERDLS 
QYYYRSSRKTKLKPFERDLT 

NEFFYRRFRHGKIKPYGRDLS 
NYYYRSHRKTKLKPFERDLS 
YYYRSHRKTKLKPFERDLS 
SYFYRSHRSSKLKPFERDLS 
SYFYRSHRSSKLKPFERDLS 
SYFYRSHRSSKLKPFERDLS 
SYFYRSHRSSKLKPFERDLS 


хх * & хх 


—— 
5 a.a deletion 


12 a.a deletion 


472 479 


PPALNCYWPLNDYGFYTTTGIGYQPYR 


487 491 


; редовите тера Нитап 


PPALNCYWPLNDYGFYTITGIGYQPYR 










(Es уны UEM ne nd Civet 


PPAPNCYWPLRGYGFYTTSGIGYQPYR 
PPAENCYWPLNDYGFYITNGIGYOPYR 


f RVGENCYNPLRPYGFFTTASVENPYR | R.sinicus 
'TPPAFNCYWPLNDYGFYITNGIGYQPYR 


PPAFNCYWPLNDYGFYTTNGIGYOPYR R.affinis 
SDE-NGVYTLSTYDFYPSIPVEYQATR 


SDE-NGVYTLSTYDFYPSIPVEYQATR } R.macrotis 


SDE-NGVRTLSTYDFYPSVPVAYQATR R.pearsoni 
SDE-NGVRTLSTYDFYPSVPLEYOATR Chaerephon plicata 
SDE-NGVYTLSTYDFYPSVPLDYQATR  R.pusillus 
SDE-NGVRTLSTYDFYPNVPIEYOATR  R.sinicus 


AEGLNCYKPLASYGFTOSSGIGEOPYR _ R.blasii 
EB conl аа | R.sinicus 
SDDGNGVYTLSTYDFNPNVPVAYOATR = 
SEE-NGVRTLSTYDFNONVPLEYOATR 
SVE-ENGRTLSTYDFNONVPLEYOATR 
SEE-NGARTLSTYDFNONVPLEYOATR 
SEE-NGARTLSTYDFNQNVPLEYQATR 
* Е ж 


EE 


R.ferrumequinum 
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SARSr-Rs-BatCoV HKU3 strains 1-12/R.sinicus 


SARSr-Rm-BatCoV 279/2005 /R.macrotis 
SARSr-Rm-BatCoV Rm1/R.macrotis 

SARSr-Rp-BatCoV Rp/Shaanxi2011 /R.pusillus 

SARSr-Rf-BatCoV Rf1/R.ferrumequinum 

o! SARSr-Rf-BatCoV 273/2005 /R.ferrumequinum 
SARSr-Rf-BatC oV YNLF 31C/R.ferrumequinum 

SARSr-Rf-BatC oV YNLF 34C/R.ferrumequinum 

SARSr-Rs-BatCoV Rs672/2006 /R.sinicus 

SARSr-Rs-BatCoV RsSHCO14/R.sinicus 
SARSr-Rp-BatCoV Rp3/R.pearsoni 

SARSr-Rs-BatCoV WIV1 

SARSr-Rs-BatCoV Rs3367/R.sinicus 

Human SARS-CoV TOR2 

Human SARS-CoV BJ01 

Civet SARS-CoV civet007 

Human SARS-CoV GZ02 

Civet SARS-CoV SZ16 

Civet SARS-CoV SZ3 

SARSr-Rb-BatCoV BM48-31/BGR /R.blasii 

SARS-Cp-BatCoV Cp/Y unnan201 1 /Chaerephon plicata 


0.02 

















SARSr-Rs-BatCoV HKU3 strains 1-12/R.sinicus 


SARSr-Rm-BatCoV Rm1/R.macrotis 

o ! SARSr-Rm-BatCoV 279/2005 /R.macrotis 
SARSr-Rp-BatCoV Rp/Shaanxi2011 /R.pusillus 
SARSr-Rf-BatCoV 273/2005 /R.ferrumequinum 
SARSr-Rf-BatCoV Rf1/R.ferrumequinum 
SARSr-Cp-BatCoV Cp/Yunnan201 1/Chaerephon plicata 


SARSr-Rf-BatCoV YNLF 34C/R. ‘ferrumequinum 
SARSr-Rp-BatCoV Rp3/R.pearsoni 
SARSr-Rs-BatCoV Rs3367/R.sinicus 
SARSr-Rs-BatCoV RsSHC014/R.sinicus 

92 | SARSr-Rs-BatCoV WIV1 

SARSr-Rs-BatCoV Rs672/2006 /R.sinicus 
Human SARS-CoV GZ02 

Human SARS-CoV BJ01 

Civet SARS-CoV SZ16 

99 | Human SARS-CoV TOR2 

Clvet SARS-CoV civet007 

Civet SARS-CoV SZ3 

SARSr-Rb-BatCoV BM48-31/BGR /R.blasii 
















SARSr-Rs-BatCoV HKU3 strains 1-12/R.sinicus 


SARSr-Rp-BatCoV Rp3/R.pearsoni 
SARSr-Rm-BatCoV Rm1/R.macrotis 

01 SARSr-Rm-BatCoV 279/2005/R.macrotis 

100, SARSr-Rf-BatCoV YNLF 31C/R.Ferrumequinum 
SARSr-Rf-BatCoV YNLF 34C/R.ferrumequinum 
SARSr-Rf-BatCoV Rf1/R.ferrumequinum 
SARSr-Rf-BatCoV 273/2005/R.ferrumequinum 


SARSr-Rs-BatCoV Rs672/2006 /R.sinicus 
SARSr-Rp-BatCoV Rp/Shaanxi2011 /R.pusillus 
100, SARSr-Rs-BatCoV Rs3367/R.sinicus 
100 | SARSr-Rs-BatCoV WIV1 
SARSr-Rs-BatCoV RsSHC014/R.sinicus 
98 Civet SARS-CoV 523 
Civet SARS-CoV SZ16 
Civet SARS-CoV civet007 
168 Human SARS-CoV GZ02 
Human SARS-CoV BJ01 
99 Human SARS-CoV TOR2 


= 
0.1 


Civet SARS-CoV civet007 

Civet SARS-CoV SZ16 

Civet SARS-CoV SZ3 

Human SARS-CoV GZ02 

99, SARSr-Rf-BatCoV YNLF 31C/R.ferrumequinum 
SARSr-Rf-BatCoV YNLF 34C/R.ferrumequinum 
99 |, SARSr-Rf-BatCoV Rf1/R.ferrumequinum 
SARSr-Rf-BatCoV 273/2005 /R.ferrumequinum 










ORF8 


9, SARSr-Rs-BatCoV WIV1 

o | SARSr-Rs-BatCoV Rs3367/R.sinicus 
SARSr-Rs-BatCoV RsSHC014/R.sinicus 
SARSr-Rs-BatCoV Rs672/2006 /R.sinicus 
SARSr-Rp-BatCoV Rp/Shaanxi201 1/R.pusillus 
SARSr-Rp-BatCoV Rp3/R.pearsoni 
SARSr-Rm-BatCoV Rm1/R.macrotis 


SARSr-Rs-BatCoV HKU3 strains 1-13/R.sinicus 


99 


"SARSr-Cp- -BatCoV Cp/Yunnan201 1/Chaerephon plicata 


SARSr-Rb-BatCoV BM48-31/BGR/R. blasii 


SARSr-Cp-BatCoV Cp/Yunnan201 1/Chaerephon plicata 


























SARSr-Rs-BatCoV HKU3 strains 1-12/R.sinicus 


SARSr-Rp-BatCoV Rp/Shaanxi2011 /R.pusillus 
SARSr-Rf-BatCoV 273/2005 /R.ferrumequinum 

o  SARSr-Rf-BatCoV Rf1/R.ferrumequinum 
SARSr-Rm-BatCoV Rm1/R.macrotis 
SARSr-Rm-BatCoV 279/2005/R.macrotis 

SARSr- Rp-BatCoV Rp3/R.pusillus 

SARSr-Rf-BatCoV YNLF 31C/R.ferrumequinum 
SARSr-Rf-BatCoV YNLF 34C/R.ferrumequinum 
SARSr-Rs-BatCoV WIV1 

SARSr-Rs-BatCoV Rs3367/R.sinicus 
SARSr-Rs-BatCoV RsSHCO!14/R.sinicus 
SARSr-Rs-BatCoV Rs672/2006 /R.sinicus 
SARSr-Cp-BatCoV Cp/Y unnan2011/Chaerephon plicata 
Civet SARS-CoV civet007 

Human SARS-CoV TOR2 

Human SARS-CoV ВЈО1 

Human SARS-CoV GZ02 

Civet SARS-CoV SZ3 

Civet SARS-CoV SZ16 

SARSr-Rb-BatCoV BM48-31/BGR /R.blasii 


SARSr-Rf- BatCoV 273/2005 /R. md don 
SARSr-Rp-BatCoV Rp/Shaanxi201 1 /R.pusillus 
SARSr-Cp-BatCoV Cp/Yunnan2011 /Chaerephon plicata 
90, SARSr-Rs-BatCoV WIV1 

93| SARSr-Rs-BatCoV Rs3367/R.sinicus 
SARSr-Rs-BatCoV RsSHC014/R.sinicus 
SARSr-Rs-BatCoV Rs672/2006 /R.sinicus 
SARSr-Rp-BatCoV Rp3/R.pearsoni 
SARSr-Rf-BatCoV YNLF 31C/R.ferrumequinum 
SARSr-Rf-BatCoV YNLF 34C/R.ferrumequinum 
Civet SARS-CoV civet007 

Civet SARS-CoV SZ16 

Civet SARS-CoV SZ3 

Human SARS-CoV TOR2 

Human SARS-CoV GZ02 

Human SARS-CoV BJ01 

SARSr-Rm-BatCoV Rm1/R.macrotis 

99! SARSr-Rm-BatCoV 279/2005 /R.macrotis 
SARSr-Rb-BatCoV BM48-31/BGR /R.blasii 


Е 
0.05 







SARSr-Rs-BatCoV HKU3 strains 1-13/ R.sinicus 


SARSr-Rp-BatCoV Rp/Shaanxi2011 /R.pusillus 
SARSr-Rp-BatCoV Rp3/R.pearsoni 
SARSr-Rm-BatCoV Rm1/R.macrotis 
o! SARSr-Rm-BatCoV 279/2005 /R.macrotis 
SARSr-Cp- -BatCoV Cp/Yunnan2011 /Chaerephon plicata 
SARSr-Rs-BatCoV Rs672/2006 /R.sinicus 
8 00 , SARSr-Rf-BatCoV Rf1/R.ferrumequinum 
ро [Í SARSr-Rf BatCoV 273/2005 /R.ferrumequinum 
SARSr-Rf-BatCoV YNLF 31C/R.ferrumequinum 
oo! SARSr-Rf-BatCoV YNLF 34C/R.ferrumequinum 
SARSr-Rs-BatCoV Rs3367/R.sinicus 
93! SARSr-Rs-BatCoV WIV1 
SARSr-Rs-BatCoV RsSHC014/R.sinicus 
Human SARS-CoV GZ02 
Human SARS-CoV BJ01 
95} Human SARS-CoV Тог2 
82) Civet SARS-CoV civet007 
оо] Civet SARS-CoV 5216 
74 Civet SARS-CoV 523 
SARSr-Rb-BatCoV BM48-31/BGR /R.blasii 


ORF3 










I 
0.05 



















“ш 


754 SARSr-Rs-BatCoV HKUS strains 1-13/R.sinicus 
SARSr-Rs-BatCoV Rs672/2006/R.sinicus 
SARSr-Rs-BatCoV RsSHCO!14/R.sinicus 
SARSr-Rs-BatCoV Rs3367/R.sinicus 
SARSr-Rs-BatCoV WIV1 

SARSr-Rf-BatCoV YNLF 31C/R.ferrumequinum 
SARSr-Rf-BatCoV YNLF 34C/R.ferrumequinum 
Human SARS-CoV GZ02 

Civet SARS-CoV civet007 

Human SARS-CoV TOR2 

Civet SARS-CoV SZ3 

Civet SARS-CoV SZ16 

Human SARS-CoV BJ01 

SARSr-Rp-BatCoV Rp/Shaanxi201 1/R.pusillus 
SARSr-Rm-BatCoV Rm1/R.macrotis 
SARSr-Rm-BatCoV 279/2005/R.macrotis 
SARSr-Rp-BatCoV Rp3/R.pearsoni 

9, SARSr-Rf-BatCoV 273/2005/R.ferrumequinum 
SARSr-Rf-BatCoV Rf1/R.ferrumequinum 
SARSr-Rb-BatCoV BM48-31/BGR /R.blasii 
SARSr-Cp-BatCoV Cp/Yunnan201 1/Chaerephon plicata 


91 
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% of Permuted Trees 


Pos 


Similarity 


оооо 9000 seso 
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BootScan - Query: Civet SARS-CoV 573 


























1 УЕ 316 
| —HKU3-1 
| | —Rs3367 
| | 
| | 
| | | | | | 
А \ v . X А "j 
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SimPlot - Query: Civet SARS-CoV SZ3 
0. {> ее жаса дей —YNLF 316 
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27000 27010 27020 27030 27040 27050 27060 27070 27080 27090 
[Figure: multiple nucleotide sequence alignment spanning approximately genome positions 27000 to 28690. Sequences aligned: SARSr-Rf-BatCoV YNLF 31C, SARSr-Rf-BatCoV Rf1, civet SARS-CoV SZ3, human SARS-CoV GZ02, human SARS-CoV TOR2, SARSr-Rs-BatCoV Rs3367 and SARSr-Rs-BatCoV HKU3-1. Annotations: ORF8 start/stop codon; 29-nt deletion in human SARS-CoV TOR2. Legend: conserved nucleotides between civet SARS-CoV and Rs-BatCoV strains; conserved nucleotides between civet SARS-CoV and Rf-BatCoV strains.]
[Figure: time-scaled (molecular clock) trees. Panel labels: ORF1ab and nsp5. Time axis: 1960 to 2010. Node date estimates shown: ~1960, ~1961, ~1991, ~2001. Taxa (strain/year): civet SARS-CoV SZ3/2003, SZ16/2003, civet007/2004, civet010/2004, civet020/2004, PC4-13/2004, PC4-136/2004, PC4-227/2004, GZ0401/2003 and GZ0402/2004; human SARS-CoV GZ02/2003, CUHKW1/2003, CUHKSu10/2003, TW1/2003, CUHKAG01/2003 and CUHKAG02/2003; SARSr-Rs-BatCoV Rs3367/2012, WIV1/2012 and RsSHC014/2011; SARSr-Rp-BatCoV Rp3/2004; SARSr-Rf-BatCoV YNLF 31C/2013 and YNLF 34C/2013; SARSr-Rm-BatCoV Rm1/2004; SARSr-Rf-BatCoV 273/2005 and Rf1/2004; SARSr-Rs-BatCoV HKU3-1 to HKU3-13 (2005 to 2007).]